
Full text of "The theory of correlation as applied to farm-survey data on fattening baby beef"

Historic, archived document 

Do not assume content reflects current 
scientific knowledge, policies, or practices 



UNITED STATES DEPARTMENT OF AGRICULTURE 

BULLETIN No. 504 

OFFICE OF THE SECRETARY 

Contribution from the Office of Farm Management 

W. J. SPILLMAN, Chief 





Washington, D. C. PROFESSIONAL PAPER May 23, 1917 



THE THEORY OF CORRELATION AS APPLIED TO FARM- 
SURVEY DATA ON FATTENING BABY BEEF. 

By H. R. Tolley, Scientific Assistant. 



CONTENTS. 

Introduction 
Theory of correlation 
Computation of the coefficients 
Interpretation of the coefficients 
Summary 



INTRODUCTION. 

This paper sets forth the results of an experiment in applying the 
theory of correlation, hitherto used chiefly in the analysis of biologi- 
cal, sociological, psychological, and meteorological statistics, 1 to the 
study of some of the data of the Office of Farm Management. 

The material for the investigation was obtained from 67 records 
taken during the years 1914 and 1915 from farmers of the corn belt 
who were fattening baby beef for market. 2 The factors considered 
were: The profit or loss per head, the weight, value per hundred- 
weight, value of feed consumed per head, cost at weaning time, and 
date of sale (see Table I). Coefficients of correlation were computed 
for every pair of these factors and used as a measure of the relation- 
ship existing between them. 

THEORY OF CORRELATION. 

The writer will not attempt a detailed explanation of the theory 
of correlation but will discuss briefly the meaning of coefficients of 
correlation and the method by which they are obtained. 

1 Yule, G. U.: "Introduction to the Theory of Statistics," 1912. Yule, G. U.: "On the 
Theory of Correlation," Jour. Roy. Stat. Soc., 1897, p. 812. Davenport, C. B.: "Statistical 
Methods, With Special Reference to Biological Variation," 1914. Hooker, R. H.: 
"The Correlation of the Weather and the Crops," Jour. Roy. Stat. Soc., 1907, p. 1. 
Smith, J. W.: "Effect of Weather on Yield of Corn," Monthly Weather Review, vol. 42, 
p. 72; and "Effect of Weather on Yield of Potatoes," ibid., vol. 43, p. 232. Brown, Wm.: 
"Essentials of Mental Measurement," 1911. 

2 For a detailed account of the methods by which these data were obtained and the costs 
computed, see Report 111, Office of the Secretary, 1916. 




BULLETIN 504, U. S. DEPARTMENT OF AGRICULTURE. 

Table I. — Data on cost of producing baby beef. 



Farm    Profit      Weight      Value per   Total value   Cost per     Date of
No.     per         per         hundred-    of feed       head at      sale
        head. 1     head.       weight.     per head.     weaning      (months). 2
                                                          time.

                    Pounds.

   1    +$12.07         785       $8.35      $31.49        $23.12        8.7
   2     -22.98         750        7.75       29.62         46.33        6.3
   3      +2.79         690        7.20       20.90         30.94        2.6
   4      +6.07         820        8.00       30.00         30.02        6.4
   5     -14.05         852        7.52       31.88         44.15       12.7
   6      -9.93       1,000        9.75       37.47         64.34       12.3
   7     +13.68         825        8.50       32.68         25.76        8.8
   8     +15.15         825        8.50       23.04         32.64        8.8
   9     +27.42         800        9.50       31.06         18.01        8.5
  10      -8.92         875        9.75       59.10         33.92        8.8
  11     -19.09         922       10.14       70.52         40.20        9.9
  12     +18.75         810        9.30       30.25         28.08        9.0
  13      -7.07       1,080        9.75       71.01         38.62        8.9
  14     +13.53       1,048       10.11       47.43         39.83       10.0
  15     +38.15       1,012       10.35       49.47         20.47       10.9
  16      +9.83       1,000        9.75       40.56         42.89       10.0
  17     +19.05         807        9.70       33.58         28.43        4.3
  18      -5.73         915        9.10       38.41         51.36        8.0
  19      -2.39         910        9.15       45.48         42.17        5.0
  20     +10.93         890        9.70       48.95         25.86        8.2
  21      +9.67         876        9.40       23.91         43.98        8.7
  22      -3.65         988        8.75       43.95         38.52       12.2
  23     +49.37       1,050        9.75       27.08         27.74       10.3
  24      -7.28         798        8.25       42.84         30.09        8.0
  25     -43.00         675        8.00       44.71         50.80        6.6
  26      +5.65         689        7.75       23.39         24.61        5.5
  27      +0.59         860       10.00       49.74         35.85        7.5
  28      -7.71         746        8.00       26.53         40.30        6.7
  29      -2.78         850        8.90       33.35         43.46        6.5
  30      -1.36         890        7.30       20.33         49.80        3.5
  31      +6.76         859        8.15       27.73         30.89        4.0
  32      +7.91         765        8.10       21.02         30.83        6.0
  33     +18.07         744        9.48       34.95         19.51        7.2
  34      +1.33         700        9.00       31.61         28.25        5.0
  35     -32.63         740        9.25       45.72         49.76        5.9
  36     +12.97         800        9.00       31.00         26.57        6.5
  37     +11.15         800        7.30       17.96         27.11        2.7
  38     -43.71         740        8.50       20.12         85.66        5.0
  39     +18.15         700        7.75        9.38         24.78        3.8
  40      -9.86         785        8.25       33.70         38.73        8.8
  41      +2.18         656        8.14       21.90         30.09        4.9
  42     +23.99         925        8.60       26.75         28.55        5.9
  43     -22.97         766        8.40       54.46         35.09        6.9
  44     -12.73         750        8.50       18.02         54.33        7.0
  45     +11.80         805        8.00       20.76         34.62        5.0
  46     -22.90         924        9.85       57.59         56.63        8.0
  47      +0.27         800        9.25       48.53         28.97        7.6
  48      +5.37         800        8.25       29.79         29.03        5.7
  49      +5.33         862       10.20       49.77         30.90       10.7
  50      +2.82         800        9.00       24.81         42.86        7.0
  51     +16.68         840       10.00       45.54         19.54        8.0
  52      -7.07         840        9.25       41.09         37.53        7.7
  53      -3.09         650        7.50       12.28         41.66        3.0
  54     -24.04         775        8.40       39.86         47.90        6.0
  55     -10.45         741        8.50       29.04         45.68        6.0
  56      +2.83         768        7.20       24.53         29.62        6.0
  57      -0.09       1,060        8.30       47.10         45.41        3.0
  58      +3.88         900        8.95       37.02         42.27        6.0
  59      -6.42         793        9.60       56.46         27.69        8.5
  60      -8.37         855        8.55       42.40         40.51        5.5
  61     -27.61         850        8.05       43.83         51.29        6.0
  62      +1.64         915        8.55       34.41         39.39        6.2
  63      -0.18         811        8.60       37.13         32.90        8.9
  64      -1.00         775        8.20       18.30         43.72        5.0
  65      +8.00         742        8.25       17.39         34.85        6.9
  66      +5.55         827        8.50       22.30         42.05        7.3
  67     +21.73         950        9.50       33.64         32.09       12.0

Average   +0.78         834        8.76       35.02         37.01        7.2



1 A plus sign before the quantity in this column indicates a gain; a minus sign, a loss. 

2 In order to facilitate computation, the dates have been expressed in months and decimals of a month 
after Jan. 1; i. e., 8.7 indicates Aug. 20, 21, or 22; 6.3 indicates June 8, 9, or 10, etc. 
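
The date encoding described in footnote 2 can be sketched in Python. The rule used here (month number plus the day of the month divided by 30, rounded to one decimal) is an assumption that happens to reproduce both of the footnote's examples; the bulletin does not state the exact rule, and the function name is hypothetical.

```python
def date_to_decimal_month(month: int, day: int) -> float:
    """Encode a calendar date as 'months and decimals of a month'.

    Assumption (not stated in the bulletin): every month is treated as
    30 days long, and the result is rounded to one decimal place.
    """
    return round(month + day / 30, 1)


# The two examples given in the footnote: Aug. 20-22 and June 8-10.
for month, day in [(8, 20), (8, 21), (8, 22), (6, 8), (6, 9), (6, 10)]:
    print(month, day, date_to_decimal_month(month, day))
```

Under this assumed rule each tenth of a month covers three consecutive days, which agrees with the footnote giving three possible dates for each coded value.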



CORRELATION AS APPLIED TO FARM-SURVEY DATA. 

If, in two series of associated variables, as, say, the profit per 
head and the weight per head in the data under consideration, there 
is a tendency for a high value of the first to be associated with a 
high value of the second, the variables are said to be correlated, and 
the correlation is positive; while if a high value of the first is asso- 
ciated with a low value of the second, and vice versa, the correlation 
is said to be negative, and the best measure yet devised of the amount 
of the correlation is the so-called coefficient of correlation. In Table 
II is shown the calculation of the coefficient of correlation between 
profit and weight per head. 

The method is as follows : 

1. Find the average value for each of the variables. Here the average 
profit per head is $0.78, and the average weight 834 pounds. 

2. Calculate the departure of the individual values from the average. In 
the case of record No. 1, the departure of the profit from the average is +$11.29, 
and of the weight, -49 pounds. 

3. Find the square root of the average of the squares of these departures. 
This is the so-called " standard deviation," and is a measure of dispersion or 
the amount of variability of each variable. 

4. Find the algebraic sum of the products of each pair of individual depart- 
ures, i. e., for each record, multiply the departure of the profit from the average 
by the departure of the weight from the average, and prefix the proper sign; 
then find the difference between the sum of all the plus products and the 
sum of all the minus products. 

5. Divide this result by the number of records and the standard deviation 
of each of the variables in turn, prefix the proper sign, and the figure obtained 
is the coefficient of correlation between the two factors under consideration. 
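
The five steps above can be sketched in Python as follows. This is only an illustration of the method as described: the function name is hypothetical, the standard deviation divides by n (not n - 1) as in step 3, and the sample input is just the first five records of Table I, so its result is not the coefficient for all 67 farms.

```python
import math


def correlation_coefficient(xs, ys):
    """Coefficient of correlation computed by the five steps in the text."""
    n = len(xs)
    if n == 0 or n != len(ys):
        raise ValueError("need two series of equal, non-zero length")

    # Step 1: average value of each variable.
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n

    # Step 2: departure of each value from its average.
    dx = [x - mean_x for x in xs]
    dy = [y - mean_y for y in ys]

    # Step 3: standard deviation = root of the average squared departure.
    sigma_x = math.sqrt(sum(d * d for d in dx) / n)
    sigma_y = math.sqrt(sum(d * d for d in dy) / n)
    if sigma_x == 0 or sigma_y == 0:
        raise ValueError("a series with no variation has no correlation")

    # Step 4: algebraic sum of the products of paired departures.
    sum_xy = sum(a * b for a, b in zip(dx, dy))

    # Step 5: divide by the number of records and both standard deviations.
    return sum_xy / (n * sigma_x * sigma_y)


# Farms 1 to 5 of Table I: profit per head and weight per head.
profit = [12.07, -22.98, 2.79, 6.07, -14.05]
weight = [785, 750, 690, 820, 852]
print(round(correlation_coefficient(profit, weight), 3))
```

Keeping the signed products in one sum makes the separate bookkeeping of plus and minus products in step 4 unnecessary; the sign of the result follows automatically.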

If there are approximately the same number of positive and nega- 
tive products and they are of the same size, it will be evident that 
there is no correlation, and this will be shown by the fact that the 
coefficient of correlation will be zero, or nearly so. If high values 
of the first variable are associated with high values of the second, 
and low values of the first with low values of the second, most of 
the products will be plus, and the greater their sum the closer will 
be the correlation and the larger will be the coefficient obtained. 
If a value of one variable below the average is generally associated 
with a value of the other above the average, the correlation will 
evidently be negative, and this will be shown by the fact that the 
sum of the products will be negative, the degree of the correlation 
and the size of the coefficient depending upon the size of this sum. 

Expressed algebraically, the coefficient of correlation, 

$$ r = \frac{\Sigma xy}{n\,\sigma_x\,\sigma_y}, \qquad \text{(I)} $$

where $\Sigma xy$ is the sum of the products above mentioned, n is the number 
of pairs of variables (the same as the number of records); $\sigma_x$ 






Table II. — Calculation of coefficient of correlation between profit and weight 

per head. 



Farm    Profit per                                 Weight per
No.     head. 1      x (profit      x².            head.         y (weight    y².         xy.
        (X)          departure)                    (Y)           departure)

                                                   Pounds.

   1    +$12.07     +11.29       127.69       785      -49      2,401        -553.7
   2     -22.98     -23.76       566.44       750      -84      7,056      +1,999.2
   3      +2.79      +2.01         4.00       690     -144     20,736        -288.0
   4      +6.07      +5.29        28.09       820      -14        196         -74.2
   5     -14.05     -14.83       219.04       852      +18        324        -266.4
   6      -9.93     -10.71       114.49     1,000     +166     27,556      -1,776.2
   7     +13.68     +12.90       166.41       825       -9         81        -116.1
   8     +15.15     +14.37       207.36       825       -9         81        -129.6
   9     +27.42     +26.64       707.56       800      -34      1,156        -904.4
  10      -8.92      -9.70        94.09       875      +41      1,681        -397.7
  11     -19.09     -19.87       396.01       922      +88      7,724      -1,751.2
  12     +18.75     +17.97       324.00       810      -24        576        -432.0
  13      -7.07      -7.85        62.41     1,080     +246     60,516      -1,943.4
  14     +13.53     +12.75       163.84     1,048     +214     45,796      +2,739.2
  15     +38.15     +37.37     1,398.76     1,012     +178     31,684      +6,657.2
  16      +9.83      +9.05        81.00     1,000     +166     27,556      +1,494.0
  17     +19.05     +18.27       334.89       807      -27        729        -494.1
  18      -5.73      -6.51        42.25       915      +81      6,561        -526.5
  19      -2.39      -3.17        10.24       910      +76      5,776        -243.2
  20     +10.93     +10.15       104.04       890      +56      3,136        +571.2
  21      +9.67      +8.89        79.21       876      +42      1,764        +373.8
  22      -3.65      -4.43        19.36       988     +154     23,716        -677.6
  23     +49.37     +48.59     2,361.96     1,050     +216     46,656     +10,497.6
  24      -7.28      -8.06        65.61       798      -36      1,296        +291.6
  25     -43.00     -43.78     1,918.44       675     -159     25,281      +6,964.2
  26      +5.65      +4.87        24.01       689     -145     21,025        -710.5
  27      +0.59      -0.19          .04       860      +26        676          -5.2
  28      -7.71      -8.49        72.25       746      -88      7,744        +748.0
  29      -2.78      -3.56        12.96       850      +16        256         -57.6
  30      -1.36      -2.14         4.41       890      +56      3,136        -117.6
  31      +6.76      +5.98        36.00       859      +25        625        +150.0
  32      +7.91      +7.13        50.41       765      -69      4,761        -489.9
  33     +18.07     +17.29       299.29       744      -90      8,100      -1,557.0
  34      +1.33      +0.55          .36       700     -134     17,956         -80.4
  35     -32.63     -33.41     1,115.56       740      -94      8,836      +3,139.6
  36     +12.97     +12.19       148.84       800      -34      1,156        -414.8
  37     +11.15     +10.37       108.16       800      -34      1,156        -353.6
  38     -43.71     -44.49     1,980.25       740      -94      8,836      +4,183.0
  39     +18.15     +17.37       302.76       700     -134     17,956      -2,331.6
  40      -9.86     -10.64       112.36       785      -49      2,401        +519.4
  41      +2.18      +1.40         1.96       656     -178     31,684        -249.2
  42     +23.99     +23.21       556.96       925      +91      8,281      +2,111.2
  43     -22.97     -23.75       566.44       766      -68      4,624      +1,618.4
  44     -12.73     -13.63       184.96       750      -84      7,056      +1,142.4
  45     +11.80     +11.02       121.00       805      -29        841        -319.0
  46     -22.90     -23.68       561.69       924      +90      8,100      -2,133.0
  47      +0.27      -0.51          .25       800      -34      1,156         +17.0
  48      +5.37      +4.59        21.16       800      -34      1,156        -156.4
  49      +5.33      +4.55        21.16       862      +28        784        +128.8
  50      +2.82      +2.04         4.00       800      -34      1,156         -68.0
  51     +16.68     +15.90       252.81       840       +6         36         +95.4
  52      -7.07      -7.85        60.84       840       +6         36         -46.8
  53      -3.09      -3.87        15.21       650     -184     33,856        +717.6
  54     -24.04     -24.82       615.04       775      -59      3,481      +1,463.2
  55     -10.45     -11.23       125.44       741      -93      8,649      +1,041.6
  56      +2.83      +2.05         4.00       768      -66      4,356        -132.0
  57      -0.09      -0.87          .81     1,060     +226     51,076        -203.4
  58      +3.88      +3.10         9.61       900      +66      4,356        +204.6
  59      -6.42      -7.20        51.84       793      -41      1,681        +295.2
  60      -8.37      -9.15        84.64       855      +21        441        -193.2
  61     -27.61     -28.36       806.56       850      +16        256        -454.4
  62      +1.64      +0.86          .81       915      +81      6,561         +72.9
  63      -0.18      -0.96         1.00       811      -23        529         +23.0
  64      -1.00      -1.78         3.24       775      -59      3,481        +106.2
  65      +8.00      +7.22        51.84       742      -92      8,464        -662.4
  66      +5.55      +4.77        23.04       827       -7         49         -33.6
  67     +21.73     +20.95       441.00       950     +116     13,456      +2,436.0

Average profit per head, +0.78.      Sum of x², 18,452.16.      σ_x = $16.60.
Average weight per head, 834.        Sum of y², 660,257.        σ_y = 99 lbs.
Sum of xy, +30,457.6.

$$ r = \frac{\Sigma xy}{n\,\sigma_x\,\sigma_y} = \frac{+30457.6}{(67)(16.60)(99)} = +0.277 $$

$$ E_r = \pm .6745\,\frac{1 - r^2}{\sqrt{n}} = \pm .076 $$



1 A plus sign before the quantity in this column indicates a profit, a minus sign a loss. 
The quantities in the column headed x are given to two places of decimals, but it was 
found that the use of one decimal place would give the quantities in the x² and xy 
columns with sufficient accuracy, and the computations were made accordingly. Thus, 
for farm No. 1, (11.3)² = 127.69, and (+11.3)(-49) = -553.7. 




is the standard deviation of the first variable; and $\sigma_y$ the standard 
deviation of the second. The value of "r" will always be between 
+1 and -1, +1 indicating perfect positive correlation, and -1 
perfect negative correlation; and to be significant, the value should 
be appreciably greater than its probable error, 

$$ E_r = \pm .6745\,\frac{1 - r^2}{\sqrt{n}}. \qquad \text{(II)} $$

In the example, r = +.277, and its probable error is ±.076, so there 
was a tendency for the heavier calves to return a greater profit, but 
the correlation is by no means perfect. 
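
As a check on the arithmetic, the closing figures of Table II can be run through formulas (I) and (II) in a few lines of Python. This is a sketch only: the variable and function names are hypothetical, and the inputs are the rounded totals printed at the foot of Table II rather than a recomputation from the 67 records.

```python
import math


def probable_error(r: float, n: int) -> float:
    """Probable error of a correlation coefficient, formula (II)."""
    return 0.6745 * (1 - r ** 2) / math.sqrt(n)


# Totals as printed at the foot of Table II.
n = 67             # number of records
sum_xy = 30457.6   # algebraic sum of the products of departures
sigma_x = 16.60    # standard deviation of profit, dollars
sigma_y = 99.0     # standard deviation of weight, pounds

r = sum_xy / (n * sigma_x * sigma_y)   # formula (I)
e_r = probable_error(r, n)

print(f"r   = {r:+.3f}")
print(f"E_r = {e_r:.3f}")
print(f"r is {r / e_r:.1f} times its probable error")
```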

PARTIAL CORRELATION. 

A study in which many factors are concerned is not complete 
until it is determined whether or not an apparent correlation, meas- 
ured in the manner explained above, is due to the fact that each 
of the two variables (or factors) under consideration is correlated 
with another or even several other variables. For instance, in the 
data under consideration there is apparently a high correlation between 
the weight of the calves and the value per hundredweight 
received for them (r = +.56), and the question now arises whether heavier 
calves really do demand a higher price on the market. This correlation 
might be due entirely or in part to the fact that the heavier 
calves in the records obtained were sold at a later date, and that 
the price of cattle in general was higher later in the season; that 
is, the correlation exhibited here might be due to the fact that both 
weight and price are correlated with date of sale. 

In a problem of this type, where it is necessary to consider simul- 
taneously the relation between three variables and to determine the 
correlation between any two, a coefficient of net or partial correlation 1 
can be determined by the formula 

$$ r_{ab.c} = \frac{r_{ab} - r_{ac}\,r_{bc}}{\sqrt{(1 - r_{ac}^2)(1 - r_{bc}^2)}}. \qquad \text{(III)} $$

Calling the three variables a, b, and c, the terms of the formula are: 
$r_{ab.c}$ is the coefficient of net correlation between a and b, when the 
effect of c is considered; $r_{ab}$ is the ordinary coefficient of gross correlation 
between a and b and is obtained as explained above; $r_{ac}$ and $r_{bc}$ 
are the coefficients of gross correlation between a and c and b and c, 
respectively. Continuing with the example above, let us endeavor to 
determine the degree of correlation between weight and value per 
hundredweight, after taking into account any effect that date of sale 
might have had. In other words, we seek an answer to the question: 

1 Yule, G. U. : " Introduction to the Theory of Statistics," p. 229 et seq. 




What would have been the coefficient of correlation between weight 
and price if all the calves had been sold on the same date? Calling 
the weight w, the value per hundredweight v, and the date of 
sale d, the gross correlation coefficients are: 1 $r_{wv}$ = +.56; $r_{vd}$ = +.61; 
$r_{wd}$ = +.60. Applying the formula (III), we have: 

$$ r_{wv.d} = \frac{+.56 - (+.61)(+.60)}{\sqrt{(1 - .61^2)(1 - .60^2)}} = +.31. $$

This value, +.31, is appreciably smaller than the value, +.56, of 
the gross coefficient, showing that the apparent correlation between 
weight and price is partly, but not entirely, due to their mutual 
correlation with the date of sale. 
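
The worked example above can be reproduced with a short Python sketch of formula (III). The function name is hypothetical; the three inputs are the gross coefficients quoted in the text.

```python
import math


def partial_correlation(r_ab: float, r_ac: float, r_bc: float) -> float:
    """Net correlation between a and b with c taken into account, formula (III)."""
    return (r_ab - r_ac * r_bc) / math.sqrt((1 - r_ac ** 2) * (1 - r_bc ** 2))


# a = weight (w), b = value per hundredweight (v), c = date of sale (d).
r_wv, r_wd, r_vd = 0.56, 0.60, 0.61
r_wv_d = partial_correlation(r_wv, r_wd, r_vd)
print(f"r_wv.d = {r_wv_d:+.2f}")
```

The formula is symmetric in its last two arguments, so it does not matter which of the two correlations with the third variable is passed first.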

This theory can be applied to the case of several variables by a 
simple extension of the formula. 2 

In the general case for six variables, the total number considered 
in this paper, 

$$ r_{ab.cdef} = \frac{r_{ab.cde} - r_{af.cde}\,r_{bf.cde}}{\sqrt{(1 - r_{af.cde}^2)(1 - r_{bf.cde}^2)}}. \qquad \text{(IV)} $$

$r_{ab.cdef}$ is the net coefficient of correlation between a and b, when the 
four factors, c, d, e, and f, are taken into account; $r_{ab.cde}$, $r_{af.cde}$, and 
$r_{bf.cde}$ are the coefficients of correlation between the two variables 
before the period in each case when c, d, and e are taken into account. 
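
Formulas (III) and (IV) have the same shape, so a coefficient of any order can be built up by applying one rule repeatedly. The following Python sketch does this recursively. It is an illustration under stated assumptions, not the bulletin's own computing procedure: the function name and the dictionary layout are hypothetical, and the gross coefficients used are the two-decimal values printed in this bulletin, so results may differ in the last digit from figures worked with more decimals.

```python
import math


def net_correlation(gross, a, b, controls=()):
    """Net correlation between a and b with every factor in `controls` eliminated.

    `gross` maps frozenset({x, y}) to the gross coefficient r_xy.
    Each call removes the last control factor using the formula (III)/(IV) pattern.
    """
    if not controls:
        return gross[frozenset((a, b))]
    *rest, k = controls
    rest = tuple(rest)
    r_ab = net_correlation(gross, a, b, rest)
    r_ak = net_correlation(gross, a, k, rest)
    r_bk = net_correlation(gross, b, k, rest)
    return (r_ab - r_ak * r_bk) / math.sqrt((1 - r_ak ** 2) * (1 - r_bk ** 2))


# Gross coefficients among weight (w), value per hundredweight (v),
# value of feed (f), and date of sale (d), as printed in Table III.
gross = {
    frozenset("wv"): 0.56,
    frozenset("wf"): 0.51,
    frozenset("wd"): 0.60,
    frozenset("vf"): 0.65,
    frozenset("vd"): 0.61,
    frozenset("fd"): 0.42,
}

print(f"r_wv.d  = {net_correlation(gross, 'w', 'v', ('d',)):+.2f}")
print(f"r_wv.fd = {net_correlation(gross, 'w', 'v', ('f', 'd')):+.2f}")
```

The order in which the control factors are listed does not change the result, which gives a convenient check on hand computation.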

COMPUTATION OF THE COEFFICIENTS. 

The first step in the arithmetic was the computation of the gross 
correlation coefficients. As stated above, the variables or factors 
considered were: (1) The profit or loss per head; (2) weight; 
(3) value per hundredweight; (4) total value of feed consumed per 
head; (5) cost per head at weaning time; and (6) date of sale. 
These six variables, if taken two at a time, can be combined in 15 
different ways. The first calculation was to find the coefficients of 
correlation between these 15 different pairs. In Table III these are 
the first values given. The effect of every other factor on these 
gross coefficients was then eliminated by successive applications of 
formulae III and IV. As an example, take profit and weight, 
the first pair of variables correlated. The gross coefficient was first 
corrected for the effect of value per hundredweight, value of feed 
consumed, initial cost, and date of sale, in turn. Then the effect 
of these four factors was considered, taking them two at a time. 
That is, the correlation was determined when both the value per 
hundredweight and the cost of feed were taken into consideration 
at the same time. When the effect of all these factors, taking them 

1 See Table III : Correlation coefficients. 

2 Yule, G. U.: "Introduction to the Theory of Statistics," p. 229 et seq. 



Table III. — Correlation coefficients. 



Profit (p) and     Profit (p) and     Profit (p) and     Profit (p) and     Profit (p) and
weight (w).        value per hun-     value of           cost at weaning    date of
                   dredweight (v).    feed (f).          time (c).          sale (d).

r_pw      +0.28    r_pv      +0.23    r_pf      -0.27    r_pc      -0.73    r_pd      +0.14

r_pw.v     +.18    r_pv.w     +.10    r_pf.w     -.50    r_pc.w     -.78    r_pd.w     -.03
r_pw.f     +.50    r_pv.f     +.56    r_pf.v     -.58    r_pc.v     -.73    r_pd.v      .00
r_pw.c     +.48    r_pv.c     +.25    r_pf.c     -.38    r_pc.f     -.75    r_pd.f     +.29
r_pw.d     +.24    r_pv.d     +.19    r_pf.d     -.37    r_pc.d     -.73    r_pd.c     +.16

r_pw.vf    +.39    r_pv.wf    +.48    r_pf.wv    -.64    r_pc.wv    -.78    r_pd.wv    -.08
r_pw.vc    +.43    r_pv.wc    -.04    r_pf.wc    -.83    r_pc.wf    -.92    r_pd.wf    +.06
r_pw.vd    +.20    r_pv.wd    +.12    r_pf.wd    -.50    r_pc.wd    -.79    r_pd.wc    -.18
r_pw.fc    +.85    r_pv.fc    +.71    r_pf.vc    -.74    r_pc.vf    -.83    r_pd.vf    +.02
r_pw.fd    +.43    r_pv.fd    +.50    r_pf.vd    -.58    r_pc.vd    -.73    r_pd.vc    +.02
r_pw.cd    +.49    r_pv.cd    +.18    r_pf.cd    -.50    r_pc.fd    -.77    r_pd.fc    +.38

r_pw.vfc   +.91    r_pv.wfc   +.83    r_pf.wvc   -.95    r_pc.wvf   -.97    r_pd.wvf   -.15
r_pw.vfd   +.42    r_pv.wfd   +.49    r_pf.wvd   -.65    r_pc.wvd   -.79    r_pd.wvc   -.19
r_pw.vcd   +.46    r_pv.wcd   +.04    r_pf.wcd   -.83    r_pc.wfd   -.92    r_pd.wfc   -.10
r_pw.fcd   +.83    r_pv.fcd   +.65    r_pf.vcd   -.74    r_pc.vfd   -.83    r_pd.vfc   +.06

r_pw.vfcd  +.97    r_pv.wfcd  +.94    r_pf.wvcd  -.98    r_pc.wvfd  -.98    r_pd.wvfc  +.77


Weight (w) and     Weight (w) and     Weight (w) and     Weight (w) and     Value per hun-
value per hun-     value of           cost at weaning    date of            dredweight (v)
dredweight (v).    feed (f).          time (c).          sale (d).          and value of
                                                                            feed (f).

r_wv      +0.56    r_wf      +0.51    r_wc      +0.07    r_wd      +0.60    r_vf      +0.65

r_wv.p     +.53    r_wf.p     +.63    r_wc.p     +.42    r_wd.p     +.59    r_vf.p     +.76
r_wv.f     +.35    r_wf.v     +.23    r_wc.v     +.15    r_wd.v     +.39    r_vf.w     +.51
r_wv.c     +.57    r_wf.c     +.51    r_wc.f     +.08    r_wd.f     +.49    r_vf.c     +.66
r_wv.d     +.31    r_wf.d     +.36    r_wc.d     +.12    r_wd.c     +.60    r_vf.d     +.55

r_wv.pf    +.10    r_wf.pv    +.42    r_wc.pv    +.42    r_wd.pv    +.40    r_vf.pw    +.65
r_wv.pc    +.53    r_wf.pc    +.86    r_wc.pf    +.81    r_wd.pf    +.42    r_vf.pc    +.84
r_wv.pd    +.28    r_wf.pd    +.50    r_wc.pd    +.45    r_wd.pc    +.61    r_vf.pd    +.68
r_wv.fc    +.36    r_wf.vc    +.22    r_wc.vf    +.13    r_wd.vf    +.39    r_vf.wc    +.52
r_wv.fd    +.14    r_wf.vd    +.24    r_wc.vd    +.16    r_wd.vc    +.39    r_vf.wd    +.50
r_wv.cd    +.32    r_wf.cd    +.36    r_wc.fd    +.12    r_wd.fc    +.50    r_vf.cd    +.56

r_wv.pfc   -.67    r_wf.pvc   +.88    r_wc.pvf   +.89    r_wd.pvf   +.42    r_vf.pwc   +.88
r_wv.pfd   -.09    r_wf.pvd   +.44    r_wc.pvd   +.45    r_wd.pvc   +.43    r_vf.pwd   +.65
r_wv.pcd   +.27    r_wf.pcd   +.80    r_wc.pfd   +.79    r_wd.pfc   +.36    r_vf.pcd   +.77
r_wv.fcd   +.16    r_wf.vcd   +.22    r_wc.vfd   +.14    r_wd.vfc   +.40    r_vf.wcd   +.50

r_wv.pfcd  -.90    r_wf.pvcd  +.97    r_wc.pvfd  +.96    r_wd.pvfc  +.81    r_vf.pwcd  +.97


Value per hun-     Value per hun-     Value of feed      Value of feed      Cost at weaning
dredweight (v)     dredweight (v)     (f) and cost at    (f) and date of    time (c) and
and cost at        and date of        weaning time (c).  sale (d).          date of sale (d).
weaning time (c).  sale (d).

r_vc      -0.09    r_vd      +0.61    r_fc      +0.01    r_fd      +0.42    r_cd      -0.04

r_vc.p     +.12    r_vd.p     +.60    r_fc.p     -.28    r_fd.p     +.48    r_cd.p     +.09
r_vc.w     -.16    r_vd.w     +.41    r_fc.w     -.03    r_fd.w     +.16    r_cd.w     -.11
r_vc.f     -.13    r_vd.f     +.49    r_fc.v     +.10    r_fd.v     +.03    r_cd.v     +.02
r_vc.d     -.08    r_vd.c     +.61    r_fc.d     +.04    r_fd.c     +.42    r_cd.f     -.05

r_vc.pw    -.14    r_vd.pw    +.42    r_fc.pw    -.79    r_fd.pw    +.17    r_cd.pw    -.21
r_vc.pf    +.54    r_vd.pf    +.41    r_fc.pv    -.58    r_fd.pv    +.04    r_cd.pv    +.03
r_vc.pd    +.08    r_vd.pc    +.59    r_fc.pd    -.37    r_fd.pc    +.53    r_cd.pf    +.27
r_vc.wf    -.17    r_vd.wf    +.39    r_fc.wv    +.07    r_fd.wv    -.07    r_cd.wv    -.05
r_vc.wd    -.13    r_vd.wc    +.40    r_fc.wd    -.01    r_fd.wc    +.16    r_cd.wf    -.10
r_vc.fd    -.12    r_vd.fc    +.48    r_fc.vd    +.10    r_fd.vc    +.03    r_cd.vf    +.01

r_vc.pwf   +.77    r_vd.pwf   +.41    r_fc.pwv   -.92    r_fd.pwv   -.15    r_cd.pwv   -.18
r_vc.pwd   -.05    r_vd.pwc   +.40    r_fc.pwd   -.77    r_fd.pwc   +.01    r_cd.pwf   -.13
r_vc.pfd   +.49    r_vd.pfc   +.33    r_fc.pvd   -.58    r_fd.pvc   +.07    r_cd.pvf   +.06
r_vc.wfd   -.14    r_vd.wfc   +.38    r_fc.wvd   +.06    r_fd.wvc   -.06    r_cd.wvf   -.04

r_vc.pwfd  +.90    r_vd.pwfc  +.81    r_fc.pwvd  -.96    r_fd.pwvc  -.82    r_cd.pwvf  -.75




two at a time, had been considered, they were taken three at a time, 
and finally all four were taken into account simultaneously. In 
all, 240 of the coefficients were computed. Some of them, however, 
are of little interest, and if this were simply a study of the data, 
and not also an exposition of the method used, the computation of 
some of them might have been omitted. 
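
The order of elimination described above (no other factor, then one, two, three, and finally all four) can be enumerated mechanically. The Python sketch below lists the coefficient labels in that order. The function name and label format are assumptions made for illustration; the count it produces, 16 coefficients for each of the 15 pairs, matches the number of entries in Table III.

```python
from itertools import combinations

# Letters used for the six factors in Table III.
FACTORS = ["p", "w", "v", "f", "c", "d"]


def coefficient_labels(factors):
    """List every gross and net coefficient label, pair by pair.

    For each pair, the remaining factors are eliminated none at a time,
    then one, two, three, and four at a time.
    """
    labels = []
    for a, b in combinations(factors, 2):
        others = [x for x in factors if x not in (a, b)]
        for order in range(len(others) + 1):
            for controls in combinations(others, order):
                suffix = "." + "".join(controls) if controls else ""
                labels.append(f"r_{a}{b}{suffix}")
    return labels


labels = coefficient_labels(FACTORS)
print(len(labels), "coefficients in all")
print(labels[:16])  # the sixteen coefficients for profit and weight
```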

In order to avoid needless repetition in the tables, the different 
factors are designated by letters as follows: Profit = p, weight = w, 
value per hundredweight = v, value of feed per head = f, cost at weaning 
time = c, date of sale = d, and the notation for the different coefficients 
is the same as that used in the explanation of the theory, viz: 
$r_{ab.cd}$, etc., is the coefficient of correlation between a and b when c, d, 
etc., are taken into account. 

INTERPRETATION OF THE COEFFICIENTS. 

There are four factors, namely, initial cost, value of feed consumed, 
weight, and selling price, which determine almost entirely the profit 
or loss to the farmer in finishing cattle for market. In fattening 
baby beef animals, the weight of the calves and the value of feed 
consumed both depend, to a large extent, on their age when sold 
and the length of time they were on feed. Also the price per pound 
received for them is rather intimately connected with the date 
on which they were sold, prices having had a tendency to rise as the 
season advanced. The calves for which data were gathered were 
all born in the spring, went on feed in the fall or early winter, and 
were sold some time during the following year. Consequently, any 
one of the three factors, age, length of feeding period, and date of 
sale, is a very good measure of the other two, and on account of 
this only the date of sale has been considered. 

If the price per pound, value of feed, initial cost, and date of sale 
were constant, and if nothing else affected the profit, it would vary 
directly with the weight in every case and, according to the theory, 
the coefficient of correlation between the two should be +1. The 
coefficient, $r_{pw.vfcd}$, obtained here is +.97. Similarly, if all things 
were constant except value per hundredweight and profit, there 
would be perfect positive correlation between them. The net coefficient, 
$r_{pv.wfcd}$, given in Table III is +.94. If all things were constant 
except the value of feed consumed we should expect a high negative 
correlation between it and profit, i. e., the calf that received the least 
feed would return the greatest profit. The net coefficient, $r_{pf.wvcd}$, is 
-.98. Similarly, other things being equal, perfect negative correlation 
should exist between initial cost and profit. The net coefficient 
in this case, $r_{pc.wvfd}$, is also -.98. An examination of the remainder 
of these net coefficients, which are the last ones given in the table, 






will show that, with the exception of the five between date of sale 
and the other variables, they are all numerically equal to or above 
.90. It has been shown that part, but not all, of the correlation be- 
tween weight and price was due to the date of sale, and since date 
of sale is only an approximate measure of age and length of feeding 
period, it would not be reasonable to expect the net correlation between 
it and the other variables to be perfect. The fact that all the 
net coefficients except these five are so nearly +1 or -1, when 
there was every reason to expect perfect correlation, is striking 
proof of the reliability of this method of analysis as well as of the 
accuracy of data such as those under consideration, and is at the 
same time a very good check on the computations. 

In the interpretation of the coefficients care must be taken to dis- 
tinguish between subjective and relative factors, i. e., between cause 
and effect. Most interest is naturally attached to determining to 
just what extent each of the factors under consideration is respon- 
sible for the farmer's loss or gain in his baby-beef enterprise, and 
here there can be no confusion of cause and effect, for all the other 
factors are necessarily causative. Throughout the remainder of the 
investigation the amount of profit or loss is an effect and not a cause, 
and consequently too much weight should not be given to a coefficient 
in which the effect of profit has been taken into account. 

THE APPARENT CORRELATIONS. 

In taking up the discussion of the coefficients themselves, the ap- 
parent correlations between profit and the other five factors are 

first considered : 

Coefficients of correlation. 

Profit and     Profit and value per    Profit and        Profit and cost at    Profit and
weight.        hundredweight.          value of feed.    weaning time.         date of sale.

+.28           +.23                    -.27              -.73                  +.14



These five coefficients should show the average effect of each of the 
five factors on the profit. The coefficient for profit and date of sale 
(+.14) shows that the profit on the calves sold early in the season was 
practically as great as on those sold later. The first three are all of 
nearly the same size, but are too small to indicate more than slight 
relationship. In regard to them we may say, therefore, that in the 
data under consideration: (1) There was a tendency for the heavier 
calves to return a greater profit; (2) there is some correlation be- 
tween price per pound and profit; (3) generally speaking, the farmer 
whose calves consumed feed worth more than the average made a 
profit somewhat less than the average. 
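
The relative strength of these five gross coefficients can be set against the significance test given under the theory, namely that a coefficient should be appreciably greater than its probable error. The Python sketch below computes that ratio for each coefficient with formula (II) and n = 67. The names are hypothetical, and no particular threshold is implied; the bulletin itself says only "appreciably greater."

```python
import math

N_RECORDS = 67  # number of farm records

# Gross coefficients between profit and each other factor, as listed above.
GROSS_WITH_PROFIT = {
    "weight": 0.28,
    "value per hundredweight": 0.23,
    "value of feed": -0.27,
    "cost at weaning time": -0.73,
    "date of sale": 0.14,
}


def probable_error(r: float, n: int) -> float:
    """Probable error of a correlation coefficient, formula (II)."""
    return 0.6745 * (1 - r ** 2) / math.sqrt(n)


# Ratio of each coefficient's size to its own probable error.
ratios = {
    name: abs(r) / probable_error(r, N_RECORDS)
    for name, r in GROSS_WITH_PROFIT.items()
}

for name, ratio in ratios.items():
    r = GROSS_WITH_PROFIT[name]
    print(f"{name:<25} r = {r:+.2f}   |r| / probable error = {ratio:.1f}")
```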






A very high degree of correlation between profit and cost at weaning 
time is shown by the coefficient -.73, and, as would be expected, 
it is negative. The size of this coefficient as compared to the others 
indicates that the cost of producing the calves and carrying them 
until weaning time is by far the most important factor in determining 
the profit derived by any particular farmer from the production 
of baby beef. In all of the records considered the calves were with 
the cows until they went on feed, and there was no expense directly 
chargeable to them. Bearing this in mind, the further statement is 
justified that the cost of maintaining the breeding herd and the size 
of the calf crop have considerably more to do with the profitableness 
of the enterprise than the actual preparation of the calves for market. 

Coefficients of correlation between weight and factors other than profit.

    Factors correlated                         Coefficient
    Weight and value per hundredweight            +.56
    Weight and value of feed                      +.51
    Weight and cost at weaning time               +.07
    Weight and date of sale                       +.60



The coefficient +.07, for weight and cost at weaning time, is the 
most striking one given here. Its very small size shows that there 
is no connection between the cost of the calves up to the time they 
went on feed and the weights at which they were sold. The cost of 
a calf at weaning time is determined very largely by the manner in 
which the breeding herd is handled, and consequently this coefficient 
shows further that on the farms studied the calves from the herds 
which were maintained at a low cost per head weighed just as much 
when sold as did those from herds having a high maintenance cost. 
The coefficients for weight and value of feed and weight and date of 
sale are what should normally be expected. The calves that received 
more feed than the average weighed more than the average, and the 
ones that were sold in the latter part of the season also weighed more 
than the average. The high correlation, exhibited by the coefficient 
+.56, between weight and price per pound is a surprising one, but it 
will be shown later that it is almost entirely due to the mutual corre- 
lation of these two factors with some of the others. 

The gross coefficient for value per hundredweight and value of
feed,¹ +.65, shows another apparently high correlation which may
or may not disappear when some of the other factors are taken into
account. There is no correlation between value per hundredweight
and cost at weaning time. The correlation between value per pound
and date of sale is shown by the coefficient +.61, which confirms
the statement already made that the price was generally higher later
in the season. The remaining gross coefficients are +.01 for total
value of feed consumed per head and cost at weaning time, +.42 for
value of feed consumed and date of sale, and -.04 for cost at weaning
time and date of sale. The coefficients +.01 and -.04 show that
cost at weaning time is uncorrelated with either value of feed
consumed or date of sale. With regard to the correlation between value
of feed consumed per head and date of sale, we may say that the
value of feed consumed is probably very nearly proportional to the
length of the feeding period, and if the actual length of time on feed
had been used here instead of its approximate measure, the date
of sale, the correlation would probably have been higher.

¹ For this and all coefficients mentioned hereafter, see Table III.

EFFECT OF THE OTHER FACTORS ON THE APPARENT CORRELATIONS. 

The small degree of correlation present between profit and weight 
is mostly due to differences in price, the coefficient being reduced 
from +.28 to +.18, when the value per hundredweight is taken into 
account; that is to say, the tendency of the heavier calves to be the
more profitable is mostly due to the fact that they sold for a better 
price per pound than that commanded by the smaller calves. 
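
The reduction from +.28 to +.18 can be checked from the gross coefficients alone. The sketch below assumes the standard first-order formula for a net (partial) coefficient, in which the product of the two coefficients involving the third factor is subtracted from the gross coefficient and the result is rescaled. The function name `net_coefficient` is illustrative and is not taken from the bulletin. Because the inputs are the two-decimal gross coefficients quoted in the text, the result may differ from the published value in the second decimal place.

```python
from math import sqrt


def net_coefficient(r_xy: float, r_xz: float, r_yz: float) -> float:
    """First-order net (partial) correlation of x and y with z held constant."""
    return (r_xy - r_xz * r_yz) / sqrt((1.0 - r_xz ** 2) * (1.0 - r_yz ** 2))


# Gross coefficients quoted in the text
# (p = profit, w = weight, v = value per hundredweight).
r_pw = 0.28
r_pv = 0.23
r_wv = 0.56

# Profit and weight, with value per hundredweight taken into account.
r_pw_v = net_coefficient(r_pw, r_pv, r_wv)
print(f"r_pw.v = {r_pw_v:+.2f}")
```

The same function applies to any of the one-at-a-time coefficients discussed here, provided the three gross coefficients for the factors concerned are supplied.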

The coefficient $r_{pw.f}$ is +.50, which is considerably higher than the
gross coefficient, showing that if the value of feed had been constant 
while other things remained unchanged, the correlation between 
profit and weight would have been greater. 

The correct explanation of the size of the coefficient $r_{pw.c}$, which
is +.48, is not so apparent. It indicates, however, that if the in- 
fluence of the cost at weaning time, the factor most closely related to 
profit, were eliminated, the correlation between profit and weight 
would be greater. 

When the date of sale is taken into account, the correlation be- 
tween profit and weight becomes somewhat less than the gross cor- 
relation, but the difference is not enough to be significant. 

The coefficients obtained for the correlation between weight and 
profit, when the effect of the other factors, two at a time, is con- 
sidered, are generally higher than when they are considered one at 
a time. This means that if the influence of two of the factors con- 
tributing to the profit or loss is eliminated, its correlation with any 
of the remaining factors is higher than if the influence of but one 
had been eliminated. 
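
The two-at-a-time coefficients can be obtained from the one-at-a-time coefficients by applying the same formula again. The following is a minimal sketch, assuming the standard recursive definition of a net (partial) correlation. The `GROSS` table holds the two-decimal gross coefficients quoted in the text, and the names `GROSS` and `net` are illustrative, not the bulletin's. Cost at weaning time is left out because its gross coefficient with value per hundredweight is not quoted numerically in this passage.

```python
from math import sqrt

# Gross coefficients quoted in the text, keyed by unordered pairs of factors:
# p = profit, w = weight, v = value per hundredweight,
# f = value of feed, d = date of sale.
GROSS = {
    frozenset("pw"): 0.28, frozenset("pv"): 0.23, frozenset("pf"): -0.27,
    frozenset("pd"): 0.14, frozenset("wv"): 0.56, frozenset("wf"): 0.51,
    frozenset("wd"): 0.60, frozenset("vf"): 0.65, frozenset("vd"): 0.61,
    frozenset("fd"): 0.42,
}


def net(x: str, y: str, held: str = "") -> float:
    """Net correlation of x and y with every factor in `held` kept constant."""
    if not held:
        return GROSS[frozenset((x, y))]
    # Eliminate the last held factor using coefficients of the next lower order.
    z, rest = held[-1], held[:-1]
    r_xy = net(x, y, rest)
    r_xz = net(x, z, rest)
    r_yz = net(y, z, rest)
    return (r_xy - r_xz * r_yz) / sqrt((1.0 - r_xz ** 2) * (1.0 - r_yz ** 2))


# Profit and weight with one factor, then two factors, taken into account.
for held in ("v", "vd", "f", "fd"):
    print(f"r_pw.{held} = {net('p', 'w', held):+.2f}")
```

With these rounded inputs the sketch should agree with the published coefficients to within a unit or two in the second decimal place; exact agreement is not to be expected, since the bulletin presumably worked from unrounded figures.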

It is interesting to note here that the correlation between weight 
and profit, even when the other factors are taken into account, is 
almost entirely independent of the date of sale. The apparent cor- 
relation, +.28, becomes +.24 when date of sale is taken into ac- 
count. When value per pound is taken into account, the coefficient 
is +.18; when price and date of sale are considered simultaneously,
the coefficient is +.20. Similarly, $r_{pw.f} = +.50$ and $r_{pw.fd} = +.43$;
$r_{pw.c} = +.48$ and $r_{pw.cd} = +.49$; $r_{pw.vf} = +.39$ and
$r_{pw.vfd} = +.42$; $r_{pw.vc}$ (value illegible) and $r_{pw.vcd} = +.46$;
$r_{pw.fc} = +.85$ and $r_{pw.fcd} = +.83$; $r_{pw.vfc} = +.91$ and
$r_{pw.vfcd} = +.97$.

The remainder of the coefficients will not be taken up in detail, for 
the same reasoning may be applied as has been used for those between 
profit and weight. The notation is consistent throughout, and the 
arrangement is such that any desired coefficient can be found. 

There does not seem to be any relation between cost at weaning 
time and any of the other factors considered, except profit, and since 
cost at weaning time had more influence on profit than any of the 
others, it might be of interest to know the relationship that would 
have existed between profit and the other factors if the initial cost 
had been constant. 

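The coefficients are as follows:

    Coefficient                                               Value
    $r_{pw.c}$ (profit and weight)                            +.48
    $r_{pv.c}$ (profit and value per hundredweight)           +.25
    $r_{pf.c}$ (profit and value of feed)                     -.38
    $r_{pd.c}$ (profit and date of sale)                      +.16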



From these coefficients, it is evident that if the initial cost of all 
the calves had been the same, the most important factor in deter- 
mining the profit would have been the weight when marketed; the 
other factors in the order of their importance being the total value of 
feed consumed, the price per pound, and the date of sale. How- 
ever, the correlation between profit and date of sale is still too small 
to be important. 

The statement has already been made that the apparent correlation
between weight and value per hundredweight (r = +.56) is due
to the effect of other factors. A study of the coefficients obtained 
when these other factors are taken into consideration shows that 
when the influence of date of sale is eliminated, the coefficient is
reduced to +.31; when the influence of the value of feed consumed is
eliminated, the coefficient becomes +.35; and when the two factors 
are taken into account simultaneously, the coefficient is +.14. This 
shows that the quantity of feed consumed per head was responsible 
for nearly as much of this correlation as was the date of sale, and 
that the two together account for practically the whole of it. In 
other words, the value of feed consumed and the date of sale need 
to be considered simultaneously here, because the later the date of 
sale, the longer is the feeding period, and consequently the greater 
the quantity and value of feed consumed. 
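
As a check on the first of these figures, the net coefficient for weight and value per hundredweight with date of sale eliminated can be worked from the gross coefficients quoted in the text (+.56 for weight and value per hundredweight, +.60 for weight and date of sale, +.61 for value per hundredweight and date of sale). The formula is the standard one for a first-order net (partial) correlation and is assumed here rather than quoted from this passage; the subscripts w, v, and d follow the bulletin's notation.

```latex
\begin{aligned}
r_{wv.d} &= \frac{r_{wv} - r_{wd}\, r_{vd}}{\sqrt{(1 - r_{wd}^{2})(1 - r_{vd}^{2})}} \\
         &= \frac{.56 - (.60)(.61)}{\sqrt{(1 - .3600)(1 - .3721)}} \\
         &= \frac{.194}{.634} \approx +.31
\end{aligned}
```

The same substitution with value of feed in place of date of sale (gross coefficients +.51 and +.65) gives approximately +.35, in agreement with the figure stated above.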

The gross correlation between date of sale and value per pound is 
shown by the coefficient +.61, and that between total value of feed 
consumed per head and value per pound, by the coefficient +.65.
These rather large coefficients become very little smaller when all the
other causal factors are taken into account. Therefore, there must 
be some relationship existing between value per pound and date of 
sale, and value per pound and value of feed consumed. The reason 
for the correlation between value per pound and date of sale has 
already been given. The high correlation between the value per pound
and value of feed consumed is probably due to the fact that the calves
which were fed the heaviest ration, regardless of the length of feeding
period, were the fattest when marketed, and consequently sold at a
higher price. However, the correlation between profit and value of
feed consumed per head, as measured by the coefficient $r_{pf}$, is -.27,
and when the influence of a longer feeding period is taken into account
by eliminating the effect of date of sale the correlation is still
negative.

SUMMARY. 

The results show that data such as those obtained by farm manage- 
ment surveys can be analyzed very thoroughly by the use of the corre- 
lation coefficients. It is generally known before the analysis is at- 
tempted which factors are causal and which resultant, and conse- 
quently there should be very little difficulty in interpreting the coeffi- 
cients correctly. The coefficients of net correlation afford a very good 
means of determining the net effect of each of several factors bearing 
upon a result, or of eliminating the effect of other factors when it is 
desired to find the true relationship existing between any two. 
Although it is not possible to give a definite concrete meaning to cor- 
relation coefficients, they are very concise relative measures of the 
degree of relationship existing between the factors being studied. 
They therefore give the investigator a single index which will show 
what, by the ordinary tabular method, it takes a whole table to show. 
While properly constructed tables will show whether or not any rela- 
tionship exists between two factors, it is a difficult matter to deter- 
mine which of two causes, say, has the greater effect on the result, and 
it is impossible, without a large number of records and a great amount 
of sorting and tabulation, to separate all the factors being considered 
in a study and find the effect that each one would have had if the 
others had not been present, or if they had been constant throughout 
the investigation. If the gross coefficients of correlation between 
every pair of factors have been determined, it is possible to find these 
relationships by simply substituting in the formula for determining 
a net coefficient from the gross coefficients, without any further
reference to the records themselves. This method should be especially
useful if only a limited number of records or observations are available,
for it does away with the necessity of sorting into many groups, with
the consequent falling off in the reliability of the averages obtained.

The analysis of the data on fattening baby beef animals showed:

(1) That for the herds considered, the cost of producing the calves
and carrying them until weaning time was by far the most important
factor in determining the profit;

(2) That there was no connection between the cost at weaning time
and any of the other factors, for the calves which were produced
cheaply were seemingly just as good feeders and brought just as good
a price per pound as the more expensive ones;

(3) That the weight at which the calves were sold and the date of
sale had very little effect on the profit, except for the fact that in the
two years of the records the price was higher in the latter part of the
summer, at the time when the heavier calves were put on the market;

(4) That the calves which consumed the heaviest ration sold at 
higher prices than the others, but did not return a correspondingly 
greater profit, as the advanced price scarcely offset the extra value 
of feed consumed. 






ADDITIONAL COPIES 

OF THIS PUBLICATION MAY BE PROCURED FROM 

THE SUPERINTENDENT OF DOCUMENTS 

GOVERNMENT PRINTING OFFICE 

WASHINGTON, D. C. 

AT 

5 CENTS PER COPY